88 results found.
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Finnish Italian Japanese Mandarin Chinese Polish
Availability:
Freely Available
License:
CreativeCommons
Size:
129,188 entries
Production Status:
Newly created-finished
Use:
Evaluation/Validation
- Paper title: Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Olga Majewska | Multi-SpA-Verb | /N |
Documentation:
English documentation, publicly available
Written
Corpus,
Language Type:
Multilingual
Languages:
German Hindi Italian Spanish Swedish
Availability:
Freely Available
License:
OpenSource
Size:
184880 sentences
Production Status:
Existing-updated
Use:
Parsing and Tagging
- Paper title: Semi-Supervised Dependency Parsing with Arc-Factored Variational Autoencoding
- Paper track: Long paper
- Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ge Wang | Universal Dependencies | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
English French German Italian Portuguese Spanish
Availability:
Freely Available
License:
CreativeCommons
Size:
multilingual word embeddings in 30 languages and 110 bilingual dictionaries
Production Status:
Existing-used
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: A Locally Linear Procedure for Word Translation
- Paper track: Short paper
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Soham Dan | MUSE | /N |
Documentation:
https://github.com/facebookresearch/MUSE/blob/master/README.md
Written
Lexicon,
Language Type:
Multilingual
Languages:
English French German Italian Spanish
Availability:
Freely Available
License:
Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Size:
None
Production Status:
Newly created-in progress
Use:
Word Sense Disambiguation
- Paper title: CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages
- Paper track: Long/Semantics: Lexical
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Bianca Scarlini | CluBERT Distributions | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Bulgarian Croatian Czech Danish Dutch English Estonian Finnish French German Greek Hungarian Icelandic Irish Italian Latvian Lithuanian Maltese Polish Portuguese Romanian Slovak Slovenian Spanish Swedish
Availability:
Freely Available
License:
CC-0
Size:
341856530 sentences
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: ParaCrawl: Web-Scale Acquisition of Parallel Corpora
- Paper track: Long/Resources and Evaluation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Philipp Koehn | ParaCrawl | /N |
Documentation:
None
Written
Treebank,
Language Type:
Multilingual
Languages:
Chinese English French German Italian Japanese Russian Spanish
Availability:
Freely Available
License:
CreativeCommons
Size:
None
Production Status:
Existing-used
Use:
Parsing and Tagging
- Paper title: Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
- Paper track: Short/Machine Learning for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mozhi Zhang | Universal Dependencies | /N |
Documentation:
None
Written
Evaluation Data,
Language Type:
Multilingual
Languages:
Chinese English French German Italian Japanese Russian Spanish
Availability:
From NIST
License:
Size:
None
Production Status:
Existing-used
Use:
Document Classification, Text categorisation
- Paper title: Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
- Paper track: Short/Machine Learning for NLP
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Mozhi Zhang | Reuters RCV1/RCV2 Multilingual Corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
English French Italian Spanish
Availability:
Freely Available
License:
CreativeCommons BY NC ND 4.0 International
Size:
3370 <audio-transcript-translation> triplets Other
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
- Paper track: Long/Machine Translation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Marco Turchi | MuST-SHE | /N |
Documentation:
None
Dysarthric and Healthy Speech
Corpus,
Language Type:
Monolingual
Languages:
Italian
Availability:
Freely Available
License:
Size:
179 MByte
Production Status:
Newly created-in progress
Use:
Speech Recognition/Understanding
- Paper title: EasyCall corpus: a dysarthric speech dataset
- Paper track: 13.7 Dysarthric speech/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Rosanna Turrisi | EasyCall corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Bilingual
Languages:
French Italian
Availability:
From Data Center(s)
License:
ELRA non commercial use, ELRA commercial use, ELRA evaluation use
Size:
90.5 hours
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
- Paper track: 8.1 Feature extraction and low-level feature model/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Laurent Besacier | PORTMEDIA | /N |
Documentation:
Article link: https://www.researchgate.net/publication/225285476_Robustesse_et_portabilites_multilingue_et_multi-domaines_des_systemes_de_comprehension_de_la_parole_le_projet_PortMedia (in French, publicly available)




